The World Conversation: Web Page Metadata Generation From Social Sources

Authors

  • Omar Alonso
  • Sushma Bannur
  • Kartikay Khandelwal
  • Shankar Kalyanaraman
Abstract

Over the past couple of years, social networks such as Twitter and Facebook have become the primary source for consuming information on the Internet. What chiefly distinguishes this content from traditional information sources on the Web is that social networks surface individuals' perspectives. When social media users post and share updates with friends and followers, some of those short fragments of text contain a link and a personal comment about the web page, image, or video. We are interested in mining the text around those links to better understand what people are saying about the object they refer to. The salient keywords captured from the crowd are rich metadata that we can use to augment a web page. This metadata can serve many applications, such as ranking signals, query augmentation, indexing, and organizing and categorizing content. In this paper, we present a technique called social signatures that, given a link to a web page, pulls the most important keywords from the social chatter around it, that is, a high-level representation of the web page from a social media perspective. Our findings indicate that the content of social signatures differs from that derived from the web page itself and therefore provides new insights. This difference becomes more prominent as the number of link shares increases. To showcase our work, we present the results of processing a dataset that contains around 1 billion unique URLs shared on Twitter and Facebook over a two-month period. We also provide data points that shed some light on the dynamics of content sharing in social media.
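To make the idea of a social signature concrete, the following is a minimal sketch, not the authors' method: the abstract does not say how keywords are scored, so a smoothed TF-IDF weighting is swapped in here as an assumption. The function names (`tokenize`, `social_signatures`), the stopword list, and the example URLs are all hypothetical. The sketch groups the text of posts by the link they share, then ranks each link's terms so that words common to all links score lower.

```python
import math
import re
from collections import Counter, defaultdict

# Assumed, simplified patterns and stopword list (not from the paper).
URL_RE = re.compile(r"https?://\S+")
TOKEN_RE = re.compile(r"[a-z0-9']+")
STOPWORDS = {"a", "an", "the", "and", "or", "of", "to", "in", "on",
             "for", "is", "this", "that", "it", "at", "with", "about"}


def tokenize(post):
    """Lowercase a post, drop its URLs, and return non-stopword tokens."""
    text = URL_RE.sub(" ", post.lower())
    return [t for t in TOKEN_RE.findall(text)
            if t not in STOPWORDS and len(t) > 1]


def social_signatures(posts, top_k=5):
    """Return {url: [top_k keywords]} from the text surrounding each link.

    Scoring is smoothed TF-IDF, where each URL's pooled chatter is treated
    as one document. This weighting is an illustrative assumption.
    """
    per_url = defaultdict(Counter)
    for post in posts:
        tokens = tokenize(post)
        for url in set(URL_RE.findall(post)):
            per_url[url].update(tokens)

    # Document frequency: in how many URLs' chatter each term appears.
    df = Counter()
    for counts in per_url.values():
        df.update(counts.keys())

    n = len(per_url)
    signatures = {}
    for url, counts in per_url.items():
        scored = {
            term: tf * (math.log((1 + n) / (1 + df[term])) + 1.0)
            for term, tf in counts.items()
        }
        # Sort by descending score, then alphabetically for determinism.
        ranked = sorted(scored, key=lambda t: (-scored[t], t))
        signatures[url] = ranked[:top_k]
    return signatures


posts = [
    "Great recipe for sourdough bread http://example.com/bread",
    "sourdough starter tips, great read http://example.com/bread",
    "Great match last night http://example.com/game",
]
signatures = social_signatures(posts, top_k=3)
for url, keywords in signatures.items():
    print(url, keywords)
```

In this toy input, "great" appears in the chatter for both links, so it is down-weighted relative to terms specific to one link. A production system at the scale the abstract describes would also need URL normalization, spam filtering, and distributed processing, none of which is attempted here.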


Similar resources

Improving Categorisation in Social Media Using Hyperlinks to Structured Data Sources

Social media presents unique challenges for topic classification, including the brevity of posts, the informal nature of conversations, and the frequent reliance on external hyperlinks to give context to a conversation. In this paper we investigate the usefulness of these external hyperlinks for categorising the topic of individual posts. We focus our analysis on objects that have related metad...


Topic Classification in Social Media Using Metadata from Hyperlinked Objects

Social media presents unique challenges for topic classification, including the brevity of posts, the informal nature of conversations, and the frequent reliance on external hyperlinks to give context to a conversation. In this paper we investigate the usefulness of these external hyperlinks for determining the topic of an individual post. We focus specifically on hyperlinks to objects which ha...


Metadata Extraction and Harvesting: A Comparison of Two Automatic Metadata Generation Applications

This research explores the capabilities of two Dublin Core automatic metadata generation applications, Klarity and DC.dot. The top level Web page for each resource, from a sample of 29 resources obtained from National Institute of Environmental Health Sciences (NIEHS), was submitted to both generators. Results indicate that extraction processing algorithms can contribute to useful automatic met...


Data and Methods for the Production of National Population Estimates: An Overview and Analysis of Available Metadata

Thomas Spoorenberg. Translated by: Elham Fathi, Statistical Center of Iran. Abstract: Official population estimates can be produced using a variety of data sources and methods. These range from the direct extraction of information from continuously updated population registers to procedures for updating the status of a population enumerated previously in a periodic census. Additional sources and ...


Navigating the Web with Query Tags

We propose to integrate various pieces of information about a web page (search queries, social annotations, terms extracted from the pagetext) into a navigational menu. This menu displays an auxiliary set of tags (navigational tags) selected with the goal of helping user navigation. We propose a novel framework (navigational utility) for comparing different tag selection methods. We also invest...




Publication date: 2015